Abstract: Distributed Opportunistic Scheduling (DOS) is inherently
harder than conventional opportunistic scheduling due
to the absence of a central entity that has knowledge of all the 
channel states. With DOS, stations contend for the channel using 
random access; after a successful contention, they measure the 
channel conditions and only transmit in case of a good channel, 
while giving up the transmission opportunity when the channel 
conditions are poor. The distributed nature of DOS systems 
makes them vulnerable to selfish users: by deviating from the 
protocol and using more transmission opportunities, a selfish user 
can gain a greater share of the wireless resources at the expense 
of the well-behaved users. In this paper, we address the selfishness 
problem in DOS from a game theoretic standpoint. We propose 
an algorithm that satisfies the following properties: (i) when all 
stations implement the algorithm, the wireless network is driven 
to the optimal point of operation, and (ii) one or more selfish 
stations cannot gain any profit by deviating from the algorithm. 
The key idea of the algorithm is to react to a selfish station by 
using a more aggressive configuration that (indirectly) punishes 
this station. We build on multivariable control theory to design a 
mechanism for punishment that on the one hand is sufficiently 
severe to prevent selfish behavior while on the other hand is light 
enough to guarantee that, in the absence of selfish behavior, the 
system is stable and converges to the optimum point of operation. 
We conduct a game theoretic analysis based on repeated games to 
show the algorithm's effectiveness against selfish stations. These 
results are confirmed by extensive simulations. 



I. Introduction 

Opportunistic scheduling techniques have been shown to 
provide substantial performance improvements in wireless 
networks. These techniques take advantage of the fluctuations 
in the channel conditions of the different wireless stations; by 
selecting the station with the best instantaneous channel for 
data transmission, opportunistic scheduling can utilize wireless
resources more efficiently. A key assumption of most opportunistic
scheduling techniques [1], [2] is that the scheduler is
centralized and has knowledge of the instantaneous channel
conditions of all stations.

Distributed Opportunistic Scheduling (DOS) techniques 
[3]-[6] have been proposed only recently. In contrast to
centralized schemes, with DOS each station has to make 
scheduling decisions without knowledge of the channel con- 
ditions of the other stations. Stations contend for the channel 
using random access with a given access probability. After 
successful contention, a station measures the channel and, in 
case of poor channel conditions (i.e., when the instantaneous 
transmission rate is below a given threshold), the station 
gives up the transmission opportunity. This allows all stations
to recontend for the channel, letting a station with better 
conditions win the contention, which increases the overall 
throughput. In this way, DOS techniques exploit both multi- 
user diversity across stations and time diversity across slots. 

The absence of global channel information makes DOS 
systems very vulnerable to selfish users. By deviating from 
the above protocol and using a more aggressive configuration, 
a selfish user can easily gain a greater share of the wireless 
resources at the expense of the other, well-behaved users. In 
this paper, we address the selfishness problem in DOS from a 
game theoretic standpoint. In our formulation of the problem, 
the players are the wireless stations that implement DOS 
and strive to obtain as much resources as possible from the 
wireless network. We show that, in the absence of penalties, 
the wireless network naturally tends to either great unfairness 
or network collapse. Following this result, we design a penalty 
mechanism in which any player who misbehaves will be pun- 
ished by other players in such a way that there is no incentive 
to misbehave. A key challenge when designing such a penalty 
scheme is to carefully adjust the punishment inflicted upon a 
misbehaving station. On the one hand, if the punishment is
too light, a selfish station may still benefit from misbehaving. 
On the other hand, an overreaction may itself be interpreted as 
misbehavior and could trigger punishment by other stations, 
leading to an endless spiral of increasing punishments and 
a throughput collapse. Addressing this challenge through a 
combination of game theory and multivariable control theory 
is a key part of our design. 

The most relevant prior work on DOS, by Zheng et al. [3],
sets the basic foundations of distributed opportunistic scheduling.
The authors propose a mechanism based on optimal
stopping theory and analyze its performance both with well-behaved
and with selfish users. The aim of the algorithm is to
maximize the total throughput of the network. The works in [4]-[6]
extend the basic mechanism of [3] by analyzing the case of imperfect
channel information [4], improving channel estimation through
two-level channel probing [5], and incorporating delay constraints
[6]. While our algorithm deals with the basic DOS
mechanism of [3], it could be extended with the enhancements
of [4]-[6]. The key contributions of our work are:

1) We perform a joint optimization of both the transmission
rate thresholds and the access probabilities, while [3]
only optimizes the thresholds.

2) We provide a proportionally fair allocation that achieves
a good tradeoff between total throughput and fairness,
while [3] maximizes the total throughput of the network,
which may lead to starvation of the stations with poor
channel conditions.

3) We propose a simple algorithm based on control theory
that guarantees stability and quick convergence to the
optimal point of operation, in contrast to the comparatively
complex heuristics of [3].

4) Our game theoretic analysis considers that users can
selfishly configure both their access probability and
transmission rate threshold, while the analysis of [3]
assumes that selfish users can only maliciously configure
the thresholds.

5) We use a penalty mechanism to force an optimal Nash
equilibrium, while [3] introduces a pricing mechanism
for this purpose, which may not be practical in many
scenarios; additionally, the performance of the pricing
mechanism heavily depends on the cost parameter and
even in the best case is only suboptimal.

The remainder of the paper is organized as follows. In
Section II we present an analysis of our system and derive the
optimal configuration of access probabilities and transmission
rate thresholds. In Section III we show that, in the absence of
penalties, the wireless network tends to a highly undesirable
resource allocation; based on this, we propose an algorithm
named Distributed Opportunistic scheduling with distributed
Control (DOC) that avoids this situation by implementing a
decentralized penalty mechanism that controls selfish behavior
through punishments. Section IV shows, by means of control
theory, that when all the stations implement DOC, the system
stably converges to the optimal point of operation obtained in
Section II. In Section V we conduct a game theoretic analysis
of DOC to show that stations cannot gain any profit from
behaving selfishly. The performance of the proposed scheme
is extensively evaluated through simulations in Section VI.
Finally, Section VII provides some concluding remarks.

II. Analysis and Optimal Configuration

In the following we present our system model and analyze 
the throughput as a function of the access probabilities and 
transmission rate thresholds. We then compute the optimal 
configuration of these parameters for a proportionally fair 
throughput allocation, which is well known to provide a good
tradeoff between total throughput and fairness [7].

A. System Model 

Our system model follows that of [3]-[6]. We consider a
single-hop wireless network with $N$ stations, where station
$i$ contends for the channel with an access probability $p_i$. A
collision model is assumed for the channel access, where the
channel contention of a station is successful if no other station
contends at the same time. Let $\tau$ denote the duration of a mini
slot for channel contention, which can either be empty, or can
contain a successful contention or a collision.

As in [3]-[6], we assume that a station $i$ obtains its local
channel conditions after a successful contention. Let $R_i(\theta)$
denote the corresponding transmission rate at time $\theta$. If $R_i(\theta)$
is small (indicating a poor channel), station $i$ gives up on this
transmission opportunity and lets all the stations recontend.
Otherwise, it transmits for a duration of $\mathcal{T}$. Fig. 1 depicts
an example of such channel contention. Our model, like
that of [3]-[6], assumes that $R_i(\theta)$ remains constant for the
duration of a data transmission and that different observations
of $R_i(\theta)$ are independent.^1 From [3], we have that the optimal
transmission policy is a threshold policy: for a given threshold
$R_i$, station $i$ only transmits after a successful contention if

$$R_i(\theta) > R_i$$

Fig. 1. Channel contention example (one contention with $R_i(\theta) < R_i$, one with $R_i(\theta) > R_i$).


B. Throughput Analysis 

The throughput $r_i$ achieved by station $i$ is a function of the
parameters $p_i$ and $R_i$. Let $l_i$ be the average number of bits
that station $i$ transmits upon a successful contention and $T_i$ be
the average time it holds the channel. Then, the throughput of
station $i$ is

$$r_i = \frac{p_{s,i}\, l_i}{\sum_j p_{s,j} T_j + (1 - p_s)\tau} \tag{1}$$



where $p_{s,i}$ is the probability that a mini slot contains a
successful contention of station $i$

$$p_{s,i} = p_i \prod_{j \neq i} (1 - p_j) \tag{2}$$

and $p_s$ is the probability that it contains any successful
contention

$$p_s = \sum_i p_{s,i} \tag{3}$$

Both $l_i$ and $T_i$ depend on $R_i$. Upon a successful contention,
a station holds the channel for a time $\mathcal{T} + \tau$ in case it transmits
data and $\tau$ in case it gives up the transmission opportunity.
Thus, $T_i$ can be computed as

$$T_i = \mathrm{Prob}(R_i(\theta) < R_i)\,\tau + \mathrm{Prob}(R_i(\theta) > R_i)\,(\mathcal{T} + \tau) \tag{4}$$

In case the station uses the transmission opportunity, it
transmits a number of bits given by $R_i(\theta)\mathcal{T}$, which yields

$$l_i = \int_{R_i}^{\infty} r\, \mathcal{T} f_{R_i}(r)\, dr \tag{5}$$

where $f_{R_i}(r)$ is the pdf of $R_i(\theta)$.

With the above, we can compute $r_i$ from $\mathbf{p} = \{p_1, \ldots, p_N\}$
and $\mathbf{R} = \{R_1, \ldots, R_N\}$. In the following, we obtain the optimal
configuration of these parameters to provide proportional
fairness.
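
As a rough illustration of Eqs. (1)-(3), the following Python sketch evaluates the per-station throughput for a given set of access probabilities. The function name `throughput` and its argument layout are illustrative choices of ours, and the values of $l_i$ and $T_i$ are assumed to be supplied by the caller rather than derived from a channel model.

```python
import math


def throughput(p, l, T, tau):
    """Per-station throughput r_i following Eqs. (1)-(3).

    p   : access probabilities p_i
    l   : average bits sent per successful contention, l_i (assumed given)
    T   : average channel holding times T_i (assumed given)
    tau : duration of a mini slot
    """
    n = len(p)
    # Eq. (2): p_{s,i} = p_i * prod_{j != i} (1 - p_j)
    ps_i = [
        p[i] * math.prod(1 - p[j] for j in range(n) if j != i)
        for i in range(n)
    ]
    # Eq. (3): probability that a mini slot holds any successful contention
    ps = sum(ps_i)
    # Denominator of Eq. (1): average time consumed per mini slot
    avg_slot = sum(ps_i[j] * T[j] for j in range(n)) + (1 - ps) * tau
    return [ps_i[i] * l[i] / avg_slot for i in range(n)]
```

For two identical stations with `p = [0.5, 0.5]`, each station succeeds in a quarter of the mini slots, so the two returned throughputs are equal.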

^1 The assumption that $R_i(\theta)$ remains constant during a data transmission is
a standard assumption for the block-fading channel in wireless communications
[8], [9], while the assumption that different observations are independent
is justified in [3] through numerical calculations.



C. Optimal pi configuration 

We start by computing the optimal configuration of $p_i$. Let
us define $w_i$ as

$$w_i = \frac{p_{s,i}}{p_{s,1}} \tag{6}$$

where we take station 1 as reference. From the above equation
we have that $p_{s,i} = w_i p_s / \sum_j w_j$. Substituting this in
Eq. (1) yields

$$r_i = \frac{w_i\, p_s\, l_i}{\sum_j w_j p_s T_j + \sum_j w_j (1 - p_s)\tau} \tag{7}$$

In a slotted wireless system such as the one of this paper,
the optimal success probability is approximately $1/e$ [10]. The
problem of finding the $\mathbf{p}$ configuration that maximizes the
proportionally fair rate allocation is thus equivalent to finding
the values $w_i$ that maximize $\sum_i \log(r_i)$ given that $p_s = 1/e$.
To obtain these $w_i$ values, we impose

$$\frac{\partial \sum_j \log(r_j)}{\partial w_i} = 0 \tag{8}$$

which yields

$$\frac{1}{w_i} - \frac{N \left( p_s T_i + (1 - p_s)\tau \right)}{\sum_j w_j p_s T_j + \sum_j w_j (1 - p_s)\tau} = 0 \tag{9}$$

Combining this expression for $w_i$ and $w_j$, we obtain

$$\frac{w_i}{w_j} = \frac{p_s T_j + (1 - p_s)\tau}{p_s T_i + (1 - p_s)\tau} \tag{10}$$


From the above, the solution to the optimization problem is
given by the values of $\mathbf{p}$ resulting from solving the following
system of equations:

$$\sum_i p_i \prod_{j \neq i} (1 - p_j) = \frac{1}{e} \tag{11}$$

$$\frac{p_i}{1 - p_i} \cdot \frac{1 - p_j}{p_j} = \frac{T_j + \tau(e - 1)}{T_i + \tau(e - 1)} \quad \forall i, j \tag{12}$$

This system of equations has two solutions, since $1/e$ is
only an approximation to the truly optimal success probability.
For one of the solutions, all of the access probabilities are
larger than the corresponding ones from the other. We select
the solution with the larger access probabilities, denoted by
$\mathbf{p}^* = \{p_1^*, \ldots, p_N^*\}$, and refer to them as the optimal access
probabilities.
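
The system of Eqs. (11) and (12) can be solved numerically in several ways; the sketch below is one possibility and is not taken from the paper. It relies on the observation that Eq. (12) forces $\frac{p_i}{1-p_i}(T_i + (e-1)\tau)$ to take a common value $c$ for all stations, so that the problem reduces to a one-dimensional search for the larger root of Eq. (11) in $c$. The function name, the bisection scheme and the requirement of at least two stations are assumptions made for illustration, and the $T_i$ values are assumed to be known.

```python
import math


def optimal_access_probabilities(T, tau, iters=200):
    """Larger-root solution of Eqs. (11)-(12) by bisection (illustrative)."""
    if len(T) < 2:
        raise ValueError("this sketch assumes at least two stations")
    a = [Ti + (math.e - 1) * tau for Ti in T]
    target = 1 / math.e

    def probs(c):
        # Eq. (12): p_i / (1 - p_i) = c / a_i  =>  p_i = c / (c + a_i)
        return [c / (c + ai) for ai in a]

    def p_success(c):
        p = probs(c)
        n = len(p)
        return sum(
            p[i] * math.prod(1 - p[j] for j in range(n) if j != i)
            for i in range(n)
        )

    # p_success(c) is unimodal and peaks where sum_i p_i = 1; locate the peak.
    lo, hi = 0.0, max(a)
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(probs(mid)) < 1:
            lo = mid
        else:
            hi = mid
    peak = hi
    if p_success(peak) < target:
        raise ValueError("success probability never reaches 1/e")

    # Beyond the peak p_success decreases: bracket and bisect the larger root.
    lo = hi = peak
    while p_success(hi) > target:
        hi *= 2
    for _ in range(iters):
        mid = (lo + hi) / 2
        if p_success(mid) > target:
            lo = mid
        else:
            hi = mid
    return probs(hi)
```

With two identical stations the result is symmetric and satisfies $2p(1-p) = 1/e$ on the branch with $p > 1/2$.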

Note that determining $\mathbf{p}^*$ above requires computing $T_i\ \forall i$,
which depend on the optimal configuration of the thresholds
$\mathbf{R}$. In the following section we address the computation of the
optimal $\mathbf{R}$, which we denote by $\mathbf{R}^* = \{R_1^*, \ldots, R_N^*\}$.

D. Optimal Ri configuration 

In order to obtain the optimal configuration of R, we need 
to find the transmission threshold of each station that, given 
the p* computed above, optimizes the overall performance in 
terms of proportional fairness. This is given by the following 
theorem. 

Theorem 1. Let us consider that station $k$ is alone in the
channel and it contends for the channel with $p_k = 1/e$.
Let $R_k^1$ be the transmission rate threshold that optimizes the
throughput of this station under the assumption that different
channel observations are independent. Then, $R_k^* = R_k^1$.

Proof: The proof is by contradiction. Assume that there
exists a configuration $\mathbf{R}^*$ with $R_k^* \neq R_k^1$ for some station $k$
that provides proportional fairness.

Let $l_k^1$ and $T_k^1$ be the values of $l_k$ and $T_k$ for the threshold
$R_k^1$, and $l_k^*$ and $T_k^*$ the corresponding values for $R_k^*$. Since $R_k^1$
maximizes $r_k$ when station $k$ is alone:

$$\frac{l_k^1}{T_k^1 + (e - 1)\tau} > \frac{l_k^*}{T_k^* + (e - 1)\tau} \tag{13}$$

Let us consider that there are $N$ stations in the network and
the configuration $\mathbf{R}^*$ is used. Given $\mathbf{R}^*$, the $\mathbf{p}^*$ that maximizes
$\sum_i \log(r_i)$ is given by Eqs. (11) and (12). This leads to the
following throughput for station $k$:

$$r_k^* = \frac{p_{s,k}^*\, l_k^*}{\sum_j p_{s,j}^* \left( T_j^* + (e - 1)\tau \right)} = \frac{l_k^*}{N \left( T_k^* + (e - 1)\tau \right)} \tag{14}$$

and for the other stations:

$$r_i^* = \frac{l_i^*}{N \left( T_i^* + (e - 1)\tau \right)} \quad \forall i \neq k \tag{15}$$

Let us now consider the alternative configuration $R_k^1$ for
station $k$ and $R_i^*$ for the other stations. Let us take the
access probabilities that satisfy Eqs. (11) and (12) with this
alternative configuration. This yields the following throughput
for station $k$:

$$r_k^1 = \frac{l_k^1}{N \left( T_k^1 + (e - 1)\tau \right)} > r_k^* \tag{16}$$

and for the other stations:

$$r_i^1 = \frac{l_i^*}{N \left( T_i^* + (e - 1)\tau \right)} = r_i^* \quad \forall i \neq k \tag{17}$$

With the above, we have found an alternative configuration
that provides a higher throughput to station $k$ and the same
throughput to all other stations. Therefore, this alternative
configuration increases $\sum_i \log(r_i)$, which contradicts the initial
assumption that the configuration $\mathbf{R}^*$ provides proportional
fairness. ■

Following the above theorem, the optimal configuration of
the thresholds $\mathbf{R}^*$ can be computed based on optimal stopping
theory. This is done in [3], which finds that the optimal
threshold $R_i^*$ can be obtained by solving the following fixed
point equation:

$$E\left[ \left( R_i(\theta) - R_i^* \right)^+ \right] = \frac{R_i^*\, \tau}{\mathcal{T}/e} \tag{18}$$



The above concludes the search for the optimal configura- 
tion. The key advantage of this configuration is that it allows 
each station to compute its R* based on local information 
only, which decouples the computation of R* from that of p*. 
Based on this finding, we now present a distributed mechanism 
to compute the optimal configuration where each station uses 
a fixed Ri = R* obtained locally, together with an adaptive 
algorithm to determine the optimal p*. 
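
To make the local computation of the threshold concrete, the sketch below solves the fixed point equation (18) by bisection for one specific channel. It assumes, purely for illustration, that the instantaneous rate $R_i(\theta)$ is exponentially distributed with a known mean, in which case $E[(R_i(\theta) - x)^+] = \mu e^{-x/\mu}$; the paper does not prescribe this distribution, and the function name and parameters are hypothetical.

```python
import math


def optimal_threshold_exponential(mean_rate, tau, tx_time, iters=200):
    """Solve Eq. (18) for an exponentially distributed rate (assumption).

    mean_rate : mean of R_i(theta), denoted mu in the lead-in
    tau       : mini slot duration (must be positive)
    tx_time   : data transmission duration (script T in the text)
    """
    def gap(x):
        # Left side minus right side of Eq. (18); strictly decreasing in x.
        lhs = mean_rate * math.exp(-x / mean_rate)
        rhs = x * math.e * tau / tx_time
        return lhs - rhs

    lo, hi = 0.0, mean_rate
    while gap(hi) > 0:      # grow the bracket until the sign changes
        hi *= 2
    for _ in range(iters):  # plain bisection on the bracket [lo, hi]
        mid = (lo + hi) / 2
        if gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Only quantities local to the station enter the computation, in line with the decoupling of $\mathbf{R}^*$ from $\mathbf{p}^*$ noted above.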



III. DOC Algorithm 

In this section we propose an adaptive algorithm that 
satisfies the following properties: (i) when all stations im- 
plement the algorithm, it leads to the optimal configuration 
computed above, and (ii) a selfish station cannot gain profit by
deviating from the algorithm. We first motivate our algorithm 
by showing that, in the absence of punishments, the system 
will naturally tend to a highly undesirable point of operation. 
Then, we present our algorithm which uses punishments to 
drive the system to the optimal point of operation obtained in 
the previous section. 

A. Motivation 

If no constraints are imposed on the wireless network and
stations are allowed to configure their $\{p_i, R_i\}$ parameters to
selfishly maximize their profit, the network will not naturally 
tend to the optimum configuration above. In order to show 
this, we model the wireless system as a static game in which 
each station can choose its configuration without suffering 
any penalty. The following theorem characterizes the Nash 
equilibria of this game. 

Theorem 2. In the absence of penalties, there is at least one
station that plays $p_i = 1$ in any Nash equilibrium.

Proof: The proof is by contradiction. Assume that there
is a Nash equilibrium such that $p_j \neq 1\ \forall j$. Now take one
player $i$ with throughput

$$r_i = \frac{p_i \prod_{j \neq i} (1 - p_j)\, l_i}{p_i \hat{T}_i + (1 - p_i) \hat{T}_{-i}} \tag{19}$$

where $\hat{T}_i$ is the average duration the channel is occupied
when station $i$ transmits and $\hat{T}_{-i}$ is the average duration of
a transmission or an empty mini slot when station $i$ does not
transmit.

Taking the partial derivative we have

$$\frac{\partial r_i}{\partial p_i} = \frac{\prod_{j \neq i} (1 - p_j)\, l_i\, \hat{T}_{-i}}{\left( p_i \hat{T}_i + (1 - p_i) \hat{T}_{-i} \right)^2} > 0 \tag{20}$$

It can be seen that the throughput $r_i$ is a strictly increasing
function of $p_i$ (given that the $R_i$ configuration as well as the
configuration of the other stations does not change).

It follows from the above that $\{p_i, R_i\}$, with $p_i \neq 1$, is not
the best strategy for player $i$ given the configuration of the
other stations, since $i$ would obtain a higher throughput for
$p_i = 1$ and the same $R_i$. Thus, this solution is not a Nash
equilibrium, which contradicts our initial assumption. ■

All of the above Nash equilibria are highly undesirable.
If station $i$ is the only one that plays $p_i = 1$, then player
$i$ achieves non-zero throughput while all other players have
zero throughput. Conversely, any other station $j$ also playing
$p_j = 1$ results in a network collapse and all players obtain
zero throughput.

We conclude from the above that, in the absence of punish- 
ments, selfish behaviors will severely degrade the performance 
of the wireless system. In the following, we propose an 
algorithm that addresses this problem by implementing a 
distributed punishment mechanism. 



B. Rationale behind the algorithm 

Before presenting the algorithm, we first discuss the rationale
that lies behind its design. This rationale heavily relies on
the notion of channel time that a station obtains over a certain
interval, defined as

$$t_i = \sum_{j=1}^{n_i} \left( T_i(j) + (e - 1)\tau \right) \tag{21}$$

where $n_i$ is the number of successful contentions of station $i$
in that period and $T_i(j)$ is the duration of the $j$-th successful
contention of the station. The above definition comprises
the aggregated transmission time of the station plus a fixed
overhead of $(e - 1)\tau$ that is added every time the station
accesses the channel.

An important observation that drives the design of our
algorithm is that, with the configuration of Section II, all
stations receive the same channel time, i.e., $t_i = t_j\ \forall i, j$. This
can be seen as follows. From Eq. (21) we have that over a
given interval,

$$\frac{t_i}{t_j} = \frac{n_i \left( T_i + (e - 1)\tau \right)}{n_j \left( T_j + (e - 1)\tau \right)} = \frac{p_{s,i} \left( T_i + (e - 1)\tau \right)}{p_{s,j} \left( T_j + (e - 1)\tau \right)} \tag{22}$$

since by definition $n_i / n_j = p_{s,i} / p_{s,j}$. Furthermore, from
Eq. (12) we have $p_{s,i} (T_i + (e - 1)\tau) = p_{s,j} (T_j + (e - 1)\tau)$
and thus $t_i = t_j$.

Since the overhead in the definition of channel time, $(e - 1)\tau$,
coincides with the average time between two successes
for the optimal configuration, it follows from the above that,
when all stations use the optimal configuration over a given
time interval $T_{total}$, they all observe the same optimal channel
time $t^*$,

$$t^* = T_{total} / N \tag{23}$$

The last observation upon which our algorithm relies is that 
as long as a selfish station does not receive more channel time 
than t*, it cannot increase its throughput. The throughput of 
a station with a given channel time and Ri is equal to the 
throughput it would obtain if it were alone in the channel 
during this time with pi = 1/e and the same Ri. From 
Theorem 1 we have that this throughput is maximized for
the optimal transmission rate threshold R*. Therefore, as long 
as the station does not receive extra channel time, it will not 
be able to achieve a higher throughput. 

Given these observations, we base our algorithm on the 
following principles: (i) if a given station i detects that another 
station k is receiving a larger channel time than itself, then 
station k is considered selfish and punished by station i, and 
(ii) when punishing station $k$, station $i$ needs to make sure the
punishment is severe enough so that station $k$'s channel time
remains below $t^*$ and thus it cannot benefit from misbehaving.

C. Algorithm design 

The DOC algorithm aims at driving the system to the
optimal configuration $\{\mathbf{p}^*, \mathbf{R}^*\}$ obtained in Section II. As
discussed in Section II-D, the optimal configuration of $R_i$ can
be computed locally by each station independently of the other
stations. Therefore, with DOC each station maintains a fixed
$R_i$ (equal to the optimal value) and implements an adaptive
algorithm to configure its access probability $p_i$.

Fig. 2. DOC control system.

Time is divided into intervals in such a way that each 
station updates its access probability pi at the beginning of 
an interval. The central idea behind DOC is that when a 
misbehaving station is detected, the other stations increase 
their access probabilities in subsequent intervals to prevent 
the selfish station from benefiting from the misbehavior. 

A key challenge in DOC is to carefully adjust the reaction 
against a selfish station. If the reaction is not severe enough, 
a selfish station may benefit from misbehaving, but if the 
reaction is too severe, the system may turn unstable by entering 
an endless loop where all stations indefinitely increase their 
$p_i$ to punish each other.

Control theory is a particularly suitable tool to address this 
challenge, since it helps to guarantee the convergence and 
stability of adaptive algorithms. We use techniques from multivariable
control theory [11] for the design of the DOC algorithm.
The algorithm is based on the classic system illustrated
in Fig. 2, where each station runs an independent controller
in order to compute its configuration. The controller that we
have chosen for this paper is a proportional-integral (PI)
controller, a well known controller from classic control theory
that has been used by a number of networking algorithms in
the literature [12]-[14].

As shown in the figure, the PI controller of station i takes 
the error signal Ei as input and provides the control signal 
Pi as output. The error signal serves to evaluate the state of 
the system. If the system is operating as desired, the error 
signal of all stations is zero. Otherwise, the error is non-zero 
and we need to drive the system from its current state to the 
desired point of operation. In order to do this, the PI controller 
adjusts the control signal Pi by (appropriately) increasing it 
when $E_i > 0$ and decreasing it otherwise. In the following,
we address the design of Pi and Ei. 

D. Control signal Pi 

The goal of the adaptive algorithm implemented by the 
controller of a station is to adjust the access probability pi 
with which the station contends. Hence, there needs to be a 
one-to-one mapping between the control signal Pi and pi. In 



addition, we impose that in the optimal point of operation, the
$P_i$ values of all stations are the same. This latter requirement
is necessary to obtain the conditions for stability in Section IV.
Based on the above requirements, we design $P_i$ as

$$P_i = \frac{p_i}{1 - p_i} \left( T_i + (e - 1)\tau \right) \tag{24}$$

Hence, a station can compute its $p_i$ from the control signal
$P_i$ as

$$p_i = \frac{P_i}{T_i + (e - 1)\tau + P_i} \tag{25}$$
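
The one-to-one mapping between the control signal and the access probability can be sketched as the following pair of Python helpers. The function names are hypothetical, and $T_i$ and $\tau$ are assumed to be known to the station.

```python
import math


def control_from_access_probability(p_i, T_i, tau):
    """Eq. (24): control signal P_i for an access probability p_i < 1."""
    return p_i / (1 - p_i) * (T_i + (math.e - 1) * tau)


def access_probability_from_control(P_i, T_i, tau):
    """Eq. (25): inverse mapping, valid for any P_i >= 0."""
    return P_i / (T_i + (math.e - 1) * tau + P_i)
```

Applying the two functions in sequence returns the original value, and any non-negative control signal maps to an access probability in $[0, 1)$, so the controller output never needs to be clipped from above.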



E. Error signal Ei 

The design of the error signal $E_i$ has the following two
goals: (i) selfish stations should not be able to obtain extra
channel time from the wireless network by using a configuration
different from the optimal one, and (ii) as long as there
are no selfish stations, $\mathbf{p}$ should converge to the optimal $\mathbf{p}^*$.

To this end, each station measures its channel time as well
as that of the other stations at the end of every interval and
computes the error signal

$$E_i = \sum_{j \neq i} (t_j - t_i) - F_i \tag{26}$$

where $F_i$ is a function that we design below. The error signal
$E_i$ consists of two components:

• The first component ($\sum_{j \neq i} (t_j - t_i)$) punishes selfish stations.
If a station $i$ receives less channel time than the
other stations, this component will be positive and hence
station $i$ will increase its access probability $p_i$.
• The second component ($F_i$) drives the system to the
desired point of operation in the absence of selfish
behavior (i.e., when all stations receive the same channel
time).

We next address the design of the function $F_i$. In order to
drive the current $\mathbf{p}$ to the desired $\mathbf{p}^*$ when $t_i = t_j\ \forall i, j$, we
need $F_i > 0$ for $p_i > p_i^*$, such that in this case $p_i$ decreases,
and $F_i < 0$ for $p_i < p_i^*$.

Another requirement when designing $F_i$ is that selfish
stations should not be able to obtain more channel time than $t^*$.
We first consider the case where all stations are well-behaved
and run the DOC algorithm except one that is selfish. In this
case, the error signal allows the selfish station to obtain an
additional channel time equal to $F_i$: taking $E_i = 0$ and setting
$t_i = t$ for all stations but the selfish one, the channel time $t_k$
of the selfish station is given by

$$t_k = t + F_i \tag{27}$$

As argued before, $F_i$ needs to be small enough such that

$$t_k \leq t^* \tag{28}$$

Combining the two equations above yields

$$t_k + (N - 1) t + (N - 1) F_i \leq N t^* \tag{29}$$

Using $\sum_j t_j = t_k + (N - 1) t$ we can isolate $F_i$ and obtain

$$F_i \leq \frac{1}{N - 1} D \tag{30}$$

where $D$ is defined as^2

$$D \triangleq N t^* - \sum_j t_j \tag{31}$$


With this constraint on $F_i$, the additional channel time
obtained by a selfish station does not compensate for the
overall efficiency loss due to sub-optimal access probabilities,
and hence a selfish station cannot benefit from misbehaving.

In addition, multiple selfish stations should not be able to
gain any aggregated channel time by playing a coordinated
strategy. We consider $m$ selfish stations and again set the $t_i$
of the other stations equal to $t$. From $E_i = 0$ we have

$$\sum_{j=1}^{m} t_j = m t + F_i \tag{32}$$

and we require that

$$\sum_{j=1}^{m} t_j \leq m t^* \tag{33}$$

Combining the above equations and isolating $F_i$ yields

$$F_i \leq \frac{m}{N - m} D \tag{34}$$

Eqs. (30) and (34) provide the maximum value for $F_i$ that
still prevents one or more selfish stations from benefiting from
misbehaving. Given all these requirements, we design $F_i$ as:

$$F_i = \begin{cases} \min\left( (N - 1) D,\ D/N \right), & p_i > p_i^{\min} \\ \min\left( (N - 1) D,\ -D/N,\ (N - 1)\Delta \right), & p_i \leq p_i^{\min} \end{cases} \tag{35}$$

where $\mathbf{p}^{\min} = \{p_1^{\min}, \ldots, p_N^{\min}\}$ are the access probabilities
that minimize $D$ subject to $t_i = t_j\ \forall i, j$ and $\Delta$ is the value
that $D$ takes at this point:

$$\Delta = D \big|_{\mathbf{p} = \mathbf{p}^{\min}} \tag{36}$$


The above design satisfies Eqs. (30) and (34) and fulfills
$F_i > 0$ for $p_i > p_i^*$ and $F_i < 0$ for $p_i < p_i^*$ (when
$t_i = t_j\ \forall i, j$). It thus meets all the requirements set above for
function $F_i$. Note that the term $D/N$ ensures that Eqs. (30)
and (34) are satisfied when $D > 0$, the term $(N - 1) D$ ensures
that they are satisfied when $D < 0$, and the terms $(N - 1)\Delta$
and $-D/N$ ensure that $F_i < 0$ when $t_i = t_j\ \forall i, j$ and $p_i < p_i^*$,
as illustrated in Fig. 3.
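
The way a station could evaluate its error signal from measured channel times can be sketched as follows. This is a minimal illustration of Eqs. (26), (31) and (35), not the authors' implementation: the function names are hypothetical, and the quantities $p_i^{\min}$ and $\Delta$ of Eq. (36) are assumed to have been computed beforehand and passed in as `p_min_i` and `delta`.

```python
def punishment_term(D, n_stations, p_i, p_min_i, delta):
    """F_i as designed in Eq. (35)."""
    if p_i > p_min_i:
        return min((n_stations - 1) * D, D / n_stations)
    return min((n_stations - 1) * D, -D / n_stations,
               (n_stations - 1) * delta)


def error_signal(i, channel_times, t_star, p_i, p_min_i, delta):
    """E_i of Eq. (26) from the channel times measured in one interval."""
    n = len(channel_times)
    # Eq. (31): D = N t* - sum_j t_j
    D = n * t_star - sum(channel_times)
    F_i = punishment_term(D, n, p_i, p_min_i, delta)
    # First component of Eq. (26): positive if others got more channel time
    imbalance = sum(channel_times[j] - channel_times[i]
                    for j in range(n) if j != i)
    return imbalance - F_i
```

When all stations measure the same channel time and $D = 0$, the error is zero; if one station receives more channel time than station $i$, the error of station $i$ becomes positive and its controller raises $p_i$.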

Note that we keep $F_i$ very close to the upper bound for
$p_i > p_i^*$, which means that the degree of punishment inflicted
upon selfish stations is as small as the above requirements
allow. The rationale for this design is that any punishment in
the form of an increase in access probabilities affects selfish
stations and well-behaved stations alike. Providing just enough
punishment to prevent any throughput gain for selfish stations
maintains the highest level of overall throughput for the system
in the presence of malicious users or in transient conditions.
Following a similar rationale, we keep $F_i$ well below the upper
bound for $p_i < p_i^*$.

^2 Note that $D$ is a function of the efficiency in channel contention, which
depends on $\mathbf{p}$: if channel contention is more efficient, we have a larger number
of data transmissions in the interval, which results in a larger sum of channel
times and therefore a smaller $D$.

Fig. 3. $F_i$ as a function of $p_i$ when $t_i = t_j\ \forall i, j$.



This concludes the design of the algorithm. In the following
two sections, we analytically evaluate its performance when
all stations are well-behaved (Section IV) and when some of
them behave selfishly (Section V).

IV. DOC Analysis 

We first analyze the wireless system under steady state
conditions and show that it is driven to the desired point of
operation obtained in Section II. We then conduct a transient
analysis and derive the sufficient conditions for stability.

A. Steady state analysis 

Since the controller includes an integrator, there is no steady
state error [15] and the steady solution can be obtained from

$$E_i = 0 \quad \forall i \tag{37}$$

Using Eqs. (26) and (35), $E_i$ can be computed from $t_i$ and
$t^*$, which allows expressing Eq. (37) as a system of equations
on $\mathbf{p}$. The following theorem guarantees the uniqueness of the
solution of this system of equations and shows that the unique
stable point in steady state is the desired point of operation
from Section II.^3

Theorem 3. The unique stable point of operation of the system 
in steady state is p = p*. 

Proof: Let us consider two stations $i$ and $j$. From Eq. (37)
we have $E_i - E_j = 0$, which yields

$$N t_j + F_j - N t_i - F_i = 0 \tag{38}$$

Note that $t_j > t_i$ implies $F_j \geq F_i$, and vice versa.
Therefore, the above requires that $t_i = t_j\ \forall i, j$. Substituting
this into $E_i = 0$ yields $F_i = 0$. Given $t_i = t_j$, $F_i$ is an
increasing function of $p_i$ that crosses 0 at $p_i = p_i^*$. Hence, the
only $p_i$ that satisfies $F_i = 0$ is $p_i^*$. Since this holds for all $i$,
the unique stable point of operation is $p_i = p_i^*\ \forall i$. ■



B. Stability analysis 

We next conduct a stability analysis of DOC to configure
the parameters of the PI controller. Following the definition
of a PI controller [15], station $i$ computes the value of $P_i$
at interval $\theta'$ as a function of the error values measured by
the station in the current and previous intervals based on the
following equation:

$$P_i(\theta') = K_p E_i(\theta') + K_i \sum_{\theta = 0}^{\theta' - 1} E_i(\theta) \tag{39}$$

^3 While the existence of a unique point of operation can be easily guaranteed
in a centralized system where the configuration of all stations is imposed by a
central entity, it is much harder to guarantee in a distributed system in which
each station chooses its own configuration.


where $K_p$ and $K_i$ are the parameters of the controller that we
have to configure.

The DOC system shown in Fig. 2 can be expressed in the
form of Fig. 4. In this figure, $C$ represents the function implemented
by the controllers, which computes the control signals
$P_i$ taking as input the error signals $E_i$, and $H$ represents the
wireless system, which provides the error signals $E_i$ measured
by the stations based on the control signals $P_i$. The control and
error signals in the figure are given by the following vectors:

$$\mathbf{P} = (P_1, \ldots, P_N) \tag{40}$$

and

$$\mathbf{E} = (E_1, \ldots, E_N) \tag{41}$$



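Our control system consists of one PI controller in each
station $i$ that takes $E_i$ as input and gives $P_i$ as output.
Following this, we can express the relationship between $\mathbf{E}$
and $\mathbf{P}$ as follows

$$\mathbf{P}(z) = C \cdot \mathbf{E}(z) \tag{42}$$

where

$$C = \begin{pmatrix} C_{PI}(z) & 0 & \cdots & 0 \\ 0 & C_{PI}(z) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & C_{PI}(z) \end{pmatrix} \tag{43}$$

with $C_{PI}(z)$ being the z transform of a PI controller [15],

$$C_{PI}(z) = K_p + \frac{K_i}{z - 1} \tag{44}$$
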

In order to analyze our system from a control theoretic
standpoint, we need to characterize the wireless system with a
transfer function $H$ that takes $\mathbf{P}$ as input and has $\mathbf{E}$ as output.
Eq. (26) gives a nonlinear relationship between $\mathbf{E}$ and $\mathbf{P}$. In
order to express this relationship as a transfer function, we
linearize it when the system suffers small perturbations around
its stable point of operation. We then study the linearized
model and force it to be stable. Note that the stability of the
linearized model guarantees that our system is locally stable.^4

^4 A similar approach was used in [16] to analyze RED from a control
theoretic standpoint.


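We express the perturbations around the stable point of operation as follows:

$$P = P^* + \delta P \tag{45}$$

where $P^*$ is the stable point of operation as given by Eq. (24) with $p = p^*$.

With the above, the perturbations suffered by $E$ can be approximated by

$$\delta E = H \cdot \delta P \tag{46}$$

where

$$H = \begin{pmatrix} \frac{\partial E_1}{\partial P_1} & \frac{\partial E_1}{\partial P_2} & \cdots & \frac{\partial E_1}{\partial P_N} \\ \frac{\partial E_2}{\partial P_1} & \frac{\partial E_2}{\partial P_2} & \cdots & \frac{\partial E_2}{\partial P_N} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial E_N}{\partial P_1} & \frac{\partial E_N}{\partial P_2} & \cdots & \frac{\partial E_N}{\partial P_N} \end{pmatrix} \tag{47}$$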

In order to compute these partial derivatives we proceed as 
follows. The error signal Ei can be expressed as 

EjV. (P..,('r,+(e-l)r)-p,..(T, + (e-l)r)) 



E, = Tt 



total ' 



-F, (48) 



The above can be rewritten as a function of P given by 

E,^, (P, - Pr 



E, = Ti 



total 



E.P, 



(e-l)T- 



i-Ps 



F,, 



(49) 



3 p^y- ' ■ pe 

where pe = Y[x^^Pi- 

We start by showing that $\partial F_i/\partial P_i = 0$ at the stable point of operation. It follows from Eq. (35) that



dP, 
D can be expressed as 







dP 
dP 



= 



D = Nt* - Ti 



total " 



T.xPs,'iFt+Ps{e~ l)r 



(50) 



(51) 



J2iPs.iTx + (1 -Ps)t 
The partial derivative of D can be computed as 
dD _ dD dpx 

Taking the partial derivative of Eq. ( fSTT l with respect to p 
and evaluating it at the stable point of operation yields 



(52) 



— = Ttotal 



^1 «> 



Since $p_s$ takes a maximum at the stable point of operation, we have that $\partial p_s/\partial p_i = 0$, which yields $\partial D/\partial P_i = 0$ and hence $\partial F_i/\partial P_i = 0$.

The partial derivative of $E_i$ evaluated at the stable point of operation can then be computed from Eq. (49) as



^^ ^ -{N - l)Ttotal^-p- 

Following a similar reasoning, it can be seen that 
dE, „ 1 



dP, 



T, 



total 



E, P3 



(55) 



(56) 



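Substituting these expressions in matrix $H$ gives

$$H = K_H \begin{pmatrix} -(N-1) & 1 & \cdots & 1 \\ 1 & -(N-1) & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & -(N-1) \end{pmatrix} \tag{57}$$

where $K_H$ is the common factor of the above partial derivatives, i.e. the value that Eq. (56) gives for the off-diagonal terms at the stable point of operation:

$$K_H = \left. \frac{\partial E_i}{\partial P_j} \right|_{P = P^*}, \quad j \neq i \tag{58}$$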
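The structure of this matrix can be checked numerically with a short sketch. It builds a matrix with $-(N-1)$ on the diagonal and 1 elsewhere, scaled by $K_H$, and applies it to two perturbations: one that is identical for all stations and one that sums to zero. The values of `N` and `K_H` below are hypothetical; only the structure of the matrix is taken from the text.

```python
from typing import List


def build_h(n: int, k_h: float) -> List[List[float]]:
    """Matrix with -(N-1) on the diagonal and 1 elsewhere, scaled by K_H."""
    return [[k_h * (-(n - 1) if i == j else 1.0) for j in range(n)]
            for i in range(n)]


def mat_vec(matrix: List[List[float]], vector: List[float]) -> List[float]:
    """Plain matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


N, K_H = 4, 0.5                      # hypothetical values, for illustration
H = build_h(N, K_H)

# The same perturbation at every station produces no error signal.
common = mat_vec(H, [1.0] * N)
# A zero-sum perturbation is scaled by -N * K_H.
differential = mat_vec(H, [1.0, -1.0, 0.0, 0.0])
print(common)
print(differential)
```

The second result is the reason why the factor $N K_H$ appears in the closed-loop expressions of the stability analysis.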



With the above, we have the linearized system fully characterized by matrices $C$ and $H$. The next step is to configure the $K_p$ and $K_i$ parameters of this system. The following theorem provides sufficient conditions on $\{K_p, K_i\}$ for stability.

Theorem 4. The linearized system is guaranteed to be stable as long as $K_p$ and $K_i$ meet the following conditions:

$$K_i < K_p + \frac{1}{N K_H} \tag{59}$$

$$K_i > 2K_p - \frac{1}{N K_H} \tag{60}$$

Proof: According to (6.22) of [17], we need to verify that the following transfer function is stable:

$$(I - z^{-1} C H)^{-1} C \tag{61}$$

Computing the above matrix yields

$$\begin{pmatrix} a & b & \cdots & b \\ b & a & \cdots & b \\ \vdots & \vdots & \ddots & \vdots \\ b & b & \cdots & a \end{pmatrix} \tag{62}$$

where

$$a = \frac{C_{PI}(z)}{N} \left( 1 + \frac{N-1}{1 + N z^{-1} K_H C_{PI}(z)} \right) \tag{63}$$

$$b = \frac{C_{PI}(z)}{N} \left( 1 - \frac{1}{1 + N z^{-1} K_H C_{PI}(z)} \right) \tag{64}$$

Rearranging the terms of the above two equations, we obtain

$$a = \frac{P_1(z)}{z^2 + a_1 z + a_2} \tag{65}$$

$$b = \frac{P_2(z)}{z^2 + a_1 z + a_2} \tag{66}$$

where $P_1(z)$ and $P_2(z)$ are polynomials and

$$a_1 = N K_H K_p - 1 \tag{67}$$

$$a_2 = N K_H (K_i - K_p) \tag{68}$$

According to Theorem 3.5 of [17], a sufficient condition for the stability of a transfer function is that the zeros of its pole polynomial (which is the least common denominator of all the minors of the transfer function matrix) fall within the unit circle. Applying this theorem to $(I - z^{-1} C H)^{-1} C$ yields that the roots of the polynomial $z^2 + a_1 z + a_2$ have to fall inside the unit circle. This can be ensured by choosing coefficients $\{a_1, a_2\}$ that satisfy the following three conditions [17]: $a_2 < 1$, $a_1 < a_2 + 1$ and $a_1 > -1 - a_2$. The third condition is satisfied as long as $K_i > 0$, while the other two yield $K_i < K_p + 1/(N K_H)$ and $K_i > 2K_p - 1/(N K_H)$, respectively. ■
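The root condition used in the proof can be checked numerically with a small sketch. It computes $a_1$ and $a_2$ from Eqs. (67) and (68), solves $z^2 + a_1 z + a_2 = 0$ directly, and compares the result with the three coefficient conditions quoted above. The function names and the numerical values of $N$, $K_H$, $K_p$ and $K_i$ are assumptions made for illustration only.

```python
import cmath


def poly_coefficients(n, k_h, k_p, k_i):
    """Coefficients a1, a2 of z^2 + a1 z + a2 from Eqs. (67) and (68)."""
    return n * k_h * k_p - 1.0, n * k_h * (k_i - k_p)


def roots_inside_unit_circle(a1, a2):
    """Solve z^2 + a1 z + a2 = 0 and check that both roots have modulus < 1."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    roots = ((-a1 + disc) / 2.0, (-a1 - disc) / 2.0)
    return all(abs(r) < 1.0 for r in roots)


def coefficient_conditions(a1, a2):
    """The three conditions on {a1, a2} quoted in the proof."""
    return a2 < 1.0 and a1 < a2 + 1.0 and a1 > -1.0 - a2


# Hypothetical setting: N = 10 stations and K_H = 0.05.
stable = poly_coefficients(10, 0.05, k_p=0.4, k_i=0.2)
unstable = poly_coefficients(10, 0.05, k_p=6.0, k_i=0.2)
print(roots_inside_unit_circle(*stable), coefficient_conditions(*stable))
print(roots_inside_unit_circle(*unstable), coefficient_conditions(*unstable))
```

For these two settings both checks agree: the first one passes and the second one fails.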
In addition to guaranteeing stability, our goal in the configuration of the $\{K_p, K_i\}$ parameters is to find the right tradeoff between speed of reaction to changes and oscillations under steady conditions. To this end, we use the Ziegler-Nichols rules [18], which have been designed for this purpose. First, we compute the parameter $K_u$, defined as the $K_p$ value that leads to instability when $K_i = 0$, and the parameter $T_i$, defined as the oscillation period under these conditions. Then, $K_p$ and $K_i$ are configured as follows:

$$K_p = 0.4 K_u \tag{69}$$

and

$$K_i = \frac{K_p}{0.85 T_i} \tag{70}$$

In order to compute $K_u$ we proceed as follows. From the conditions of Theorem 4 with $K_i = 0$ we have the following condition for stability:

$$K_p < \frac{1}{2 N K_H} \tag{71}$$

We take $K_u$ as the value that may turn the system unstable,

$$K_u = \frac{1}{2 N K_H} \tag{72}$$

and set $K_p$ according to Eq. (69):

$$K_p = \frac{0.4}{2 N K_H} \tag{73}$$

With the $K_p$ value that leads to instability, a given set of input values may change their sign at most every interval, yielding an oscillation period of two intervals ($T_i = 2$). Thus, from Eq. (70),

$$K_i = \frac{1}{0.85 \cdot 2} \cdot \frac{0.4}{2 N K_H} \tag{74}$$

which completes the configuration of the PI controller parameters. The stability of this configuration is guaranteed by the following corollary:

Corollary 1. The $K_p$ and $K_i$ configuration given by Eqs. (73) and (74) is stable.

Proof: It is easy to see that Eqs. (73) and (74) meet the conditions of Theorem 4. ■
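The check behind Corollary 1 can be reproduced with a few lines of code. The sketch below computes the gains from the Ziegler-Nichols configuration above ($K_u = 1/(2 N K_H)$, $K_p = 0.4 K_u$, $K_i = K_p/(0.85 T_i)$ with $T_i = 2$) and tests them against the two conditions of Theorem 4. The function names and the values of $N$ and $K_H$ are hypothetical.

```python
def doc_pi_gains(n, k_h, period=2.0):
    """Gains from the Ziegler-Nichols configuration described in the text."""
    k_u = 1.0 / (2.0 * n * k_h)        # K_p value taken as the stability limit
    k_p = 0.4 * k_u                    # proportional gain
    k_i = k_p / (0.85 * period)        # integral gain, with T_i = 2 intervals
    return k_p, k_i


def meets_theorem_4(n, k_h, k_p, k_i):
    """Sufficient conditions on K_i as stated in Theorem 4."""
    margin = 1.0 / (n * k_h)
    return k_i < k_p + margin and k_i > 2.0 * k_p - margin


N, K_H = 10, 0.05                      # hypothetical values, for illustration
K_P, K_I = doc_pi_gains(N, K_H)
print(K_P, K_I, meets_theorem_4(N, K_H, K_P, K_I))
```

Since both gains scale with $1/(N K_H)$, the outcome of the check does not depend on the particular values chosen for $N$ and $K_H$.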

V. Game Theoretic Analysis 

In the previous section we have seen that, when all stations follow the DOC algorithm, they all play with $p_i = p^*$ and $\bar{R}_i = \bar{R}_i^*$. In this section we conduct a game theoretic analysis to show that one or more stations cannot gain any profit by deviating from DOC. In what follows, we say that a station is honest or well-behaved when it implements the DOC algorithm to configure its $p_i$ and $\bar{R}_i$ parameters, while we say that it is selfish or misbehaving when it plays a different strategy from DOC to configure these parameters with the aim of obtaining some gains.

The game theoretic analysis conducted in this section assumes that users are rational and want to maximize their own benefit or utility, which is given by the throughput. The model is based on the theory of repeated games [19]. With repeated games, time is divided into stages and a player can take new decisions at each stage based on the observed behavior of the other players in the previous stages. This matches our algorithm, where time is divided into intervals and stations update their configuration at each interval. Like previous analyses of repeated games [20], [21], we consider an infinitely repeated game, which is a common assumption when the players do not know when the game will finish.

A. Single selfish station 

While the design of the DOC algorithm in Section III guarantees that a station cannot gain any profit by playing with a fixed selfish configuration, selfish stations might still gain by varying their configuration over time. As an example, let us consider a naive algorithm that only takes into account the stations' behavior in the previous stage. While this algorithm may be effective against a fixed selfish configuration, it could easily be defeated by a selfish station that alternates a selfish configuration ($p_k = 1$, $\bar{R}_k = 0$) with an honest one ($p_k = p_k^*$, $\bar{R}_k = \bar{R}_k^*$) at every other stage. Since this station would play selfish when all the others play honest, it would achieve a significantly higher throughput every other interval, thus benefiting from its misbehavior.

The above example shows that it is important to make sure 
that a selfish station cannot gain any profit no matter how 
it varies its configuration over time. The following theorem 
confirms the effectiveness of DOC against any (fixed or 
variable) selfish strategy. The proof of the theorem relies on 
the integrator component of the PI controller, which keeps 
track of the aggregated channel time received by all stations 
and can thus be used to guarantee that this aggregate does not 
exceed a given amount. 

Theorem 5. Let us consider a selfish station that uses a $p_k(\Theta)$ and $\bar{R}_k(\Theta)$ configuration that can vary over time. If all the other stations implement the DOC algorithm, the throughput received by this station will be no larger than $r_k^*$ (where $r_k^*$ is the throughput that station $k$ receives when all stations play DOC).

Proof: The PI controller computes $P_i$ at a given interval $\Theta'$ according to the following expression:

p,(e') = p™*"' + Kp[y^ fe(e') - uie')) - p,(e') 

(75) 



Note that with the above, $P_i$ will stay between 0 and a given maximum value $P^{max}$. If at some time $P_i$ reaches a $P^{max}$ value such that $p_i = 1$, then we have $t_j = 0$ for $j \neq i$ and $F_i > -(N-1)t_i$, which yields $E_i < 0$ and therefore $P_i$ decreases. Similarly, if at some time $P_i$ reaches 0, then $t_i = 0$ and $F_i < 0$, which yields $E_i > 0$ and therefore $P_i$ increases.

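Considering that $0 \le P_i(\Theta') \le P^{max}$, the above equation can be expressed as

$$\sum_{\Theta} \Big( \sum_{j \neq i} \big( t_j(\Theta) - t_i(\Theta) \big) - F_i(\Theta) \Big) = K \tag{76}$$

where $K$ is a bounded value.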

Let us consider the case in which there is a selfish station that changes its configuration over time and receives a channel time $t_k(\Theta)$, while the other stations are well-behaved and use the same configuration, obtaining the same channel time $t(\Theta)$. Then the above can be expressed as

$$\sum_{\Theta} t_k(\Theta) = \sum_{\Theta} \big( t(\Theta) + F_i(\Theta) \big) + K \tag{77}$$

Let us consider now a given interval $\Theta$. From Eq. (30) we have

$$F_i(\Theta) \le \frac{1}{N-1} \big( N t^* - t_k(\Theta) - (N-1) t(\Theta) \big) \tag{78}$$

which yields

$$(N-1) t(\Theta) + t_k(\Theta) + (N-1) F_i(\Theta) \le N t^* \tag{79}$$

Since the above equation is satisfied for all $\Theta$,

$$\sum_{\Theta} \big( (N-1) t(\Theta) + t_k(\Theta) + (N-1) F_i(\Theta) \big) \le \sum_{\Theta} N t^* \tag{80}$$

Furthermore, from Eq. (77),

$$(N-1) \sum_{\Theta} t_k(\Theta) = (N-1) \sum_{\Theta} \big( t(\Theta) + F_i(\Theta) \big) + (N-1) K \tag{81}$$

Adding the above two equations yields

$$N \sum_{\Theta} t_k(\Theta) \le N \sum_{\Theta} t^* + (N-1) K \tag{82}$$

from which

$$\sum_{\Theta} t_k(\Theta) \le \sum_{\Theta} t^* + \frac{(N-1) K}{N} \tag{83}$$

If we consider a very long period of time, the constant term in the above equation can be neglected and we obtain

$$\sum_{\Theta} t_k(\Theta) \le \sum_{\Theta} t^* \tag{84}$$

From the above, we have that the selfish station cannot receive more channel time with a selfish strategy than by playing DOC and, following the reasoning of Section III-B, therefore cannot obtain more throughput than it would obtain by playing DOC, i.e.

$$r_k \le r_k^* \tag{85}$$

which proves the theorem. ■

Footnote: Note that the game theoretic study conducted in Section III-A was based on static games instead of repeated ones. The reason is that in Section III-A we considered a system without penalties and hence we could model it as a static game where all players only make a single move at the beginning of the game and (as they are never penalized) do not need to make any further move during the rest of the game.

From the above theorem follows Corollary 2.

Corollary 2. A state in which all stations play DOC (All-DOC) is a Nash equilibrium of the game.

Proof: According to Theorem 5, if all stations but one 
play DOC, then the best response of this station is to play 
DOC as well since it cannot benefit from playing a different 
strategy. Thus, All-DOC is a Nash equilibrium. ■ 

This shows that, if all stations start playing with no previous 
history, then none of them can gain by deviating from DOC. In 
addition, in repeated games it is also important to ensure that, 
if at some point the game has a given history, a selfish station 
cannot take advantage of this history to gain profit by playing a 
strategy different from DOC. The following theorem confirms 
that All-DOC is a Nash equilibrium of any subgame (where 
a subgame is defined as the game resulting from starting to 
play with a certain history). Therefore, a selfish station cannot 
benefit by deviating from DOC independently of the previous 
history of the game. 

Theorem 6. All-DOC is a subgame perfect Nash equilibrium 
of the game. 

Proof: Since the proof of Theorem 5 is independent of the past history and can therefore be applied to any subgame, All-DOC is a Nash equilibrium of any subgame. ■

B. Multiple selfish stations 

The above results show the effectiveness of DOC against 
a single selfish station. In the following, we tackle the case 
when there are multiple selfish stations. 

The following theorem shows that, by following a strategy 
different from DOC, multiple stations cannot gain any aggre- 
gated channel time. 

Theorem 7. Let us consider a scenario with m selfish stations. 
If all other stations play DOC, the selfish stations cannot gain 
any aggregated channel time. 

Proof: Without loss of generality, let us consider that stations $i \in \{1, \ldots, m\}$ are selfish. Applying a reasoning similar to Theorem 5 leads to

$$\sum_{i=1}^{m} \sum_{\Theta} t_i(\Theta) \le m \sum_{\Theta} t^* \tag{86}$$

As the left hand side of the above equation is the aggregated channel time obtained by the selfish stations, and the right hand side is the aggregated channel time that they would obtain if they played DOC, this proves the theorem. ■

The above theorem shows that, if there is some selfish station that experiences a gain, this is because there is some other station that suffers a loss.

Corollary 3. Let us consider a scenario with $m$ selfish stations. If all other stations play DOC and a selfish station $k$ receives a throughput larger than $r_k^*$, this means that there exists another selfish station $l$ that receives a throughput smaller than $r_l^*$ (where $r_k^*$ and $r_l^*$ are the throughputs obtained by stations $k$ and $l$ if all stations played DOC).

Proof: If there is some station $k \in \{1, \ldots, m\}$ for which $r_k > r_k^*$, then we have that this station receives more channel time than it would receive if all stations played DOC. Since, according to Theorem 7, the selfish stations cannot gain any aggregated channel time, this means that there must necessarily be some other station $l \in \{1, \ldots, m\}$ that receives less channel time. This implies that $r_l < r_l^*$, which proves the corollary. ■

Based on the above, we argue that DOC is effective against 
multiple selfish stations, since two or more selfish stations 
cannot simultaneously gain profit and therefore do not have an 
incentive to play a coordinated strategy different from DOC. 

VI. Performance Evaluation 

In this section we evaluate DOC by means of simulation to show that (i) in the absence of selfish stations, DOC provides optimal performance while behaving stably and reacting quickly to changes, and (ii) selfish stations cannot benefit by following a strategy different from DOC.

Unless otherwise stated, we assume that different observations of the channel conditions are independent, and the available transmission rate for a given SNR is given by the Shannon channel capacity:

$$R(h) = W \log_2(1 + \rho |h|^2) \text{ bits/s} \tag{87}$$

where $W$ is the channel bandwidth, $\rho$ is the normalized average SNR and $h$ is the random gain of Rayleigh fading.
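As an illustration of this channel model, the following sketch draws independent channel observations and maps them to rates with Eq. (87). It assumes that $|h|^2$ is exponentially distributed with unit mean, which is the power distribution of a Rayleigh-faded gain normalized so that its average power is 1; the function names, the seed and the bandwidth used in the example are hypothetical.

```python
import math
import random


def shannon_rate(bandwidth_hz, snr, gain_sq):
    """Eq. (87): R(h) = W log2(1 + rho |h|^2), in bits/s."""
    return bandwidth_hz * math.log2(1.0 + snr * gain_sq)


def sample_rates(bandwidth_hz, snr, count, seed=0):
    """Rates for `count` independent channel observations.

    Assumption: |h|^2 follows an exponential distribution with unit mean
    (Rayleigh fading with normalized average power).
    """
    rng = random.Random(seed)
    return [shannon_rate(bandwidth_hz, snr, rng.expovariate(1.0))
            for _ in range(count)]


# Illustrative bandwidth of 1 MHz and normalized SNR of 4.
rates = sample_rates(bandwidth_hz=1e6, snr=4.0, count=5)
print([round(r / 1e6, 3) for r in rates])
```

Because every observation is drawn independently, the sketch matches the independence assumption stated above.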

We implemented the DOC algorithm in OMNeT++. In the
simulations, we set W = 10^, T /t = 10 and the interval of 
the controller Ttotai ~ lO^r. For all results, 95% confidence 
intervals are below 0.5%. 



A. Throughput evaluation 

For the throughput evaluation, we compare the performance of DOC to the following approaches: (i) the static optimal configuration obtained in Section III ('static configuration'), (ii) the configuration proposed in [3] ('DOS'), and (iii) an approach that does not perform opportunistic scheduling but always transmits after successful contention ('non-opportunistic').

We consider a scenario with $N = 10$ stations, half of them with a normalized SNR of $\rho_1 = 1$ and the other half with a normalized SNR $\rho_2$ that varies from 1 to 10. Fig. 5 shows $\sum_i \log(r_i)$, the metric that proportional fairness aims at maximizing, as a function of $\rho_2$. We observe that DOC performs at the same level as the benchmark given by the static configuration, while the other two approaches (DOS and non-opportunistic) provide a substantially lower performance.

For the above scenario with $\rho_2 = 4$, Fig. 6 depicts the individual throughput allocation of two stations (where $r_1$ is the throughput of a station with $\rho_1$ and $r_2$ that of a station with $\rho_2$). DOC is effective in driving the system to the optimal point of operation and provides the same throughput as the static configuration. In contrast, the DOS approach exhibits a high degree of unfairness and provides the station with high SNR with a much higher throughput. The non-opportunistic approach provides a good level of fairness but has lower throughput due to the lack of opportunistic scheduling. In conclusion, the proposed DOC algorithm provides a good tradeoff between overall throughput and fairness.

Footnote: http://www.omnetpp.org/

Fig. 5. Proportional fairness as a function of SNR ($\rho_1 = 1$, $1 \le \rho_2 \le 10$).

Fig. 7. Throughput of a selfish station for fixed configurations of $\{p_k, \bar{R}_k\}$.

Fig. 8. Selfish station with fixed configuration for different $N$ and $\rho_2$ values.

B. Selfish station with fixed configuration 

We verify that a station cannot obtain more throughput with a selfish configuration than by playing DOC in a scenario with $N = 10$ stations, 5 of them with $\rho_1 = 1$ and the other half (including the selfish station) with $\rho_2 = 4$. The selfish station uses a fixed configuration and all other stations implement DOC. Fig. 7 shows the throughput of the selfish station for different $\{p_k, \bar{R}_k\}$ configurations of the selfish station. This is compared to the throughput that the station would obtain if it played DOC, given by the horizontal line.

We observe that none of the selfish configurations provides more throughput than DOC. Furthermore, $r_k$ is far from $r_k^*$ for $p_k < p_k^*$ and close to $r_k^*$ for $p_k > p_k^*$. This is a consequence of the design of $F_i$ in Section III-E. For $p_k < p_k^*$, the access probabilities of the honest stations satisfy $p_i < p^*$. With these values of $p$, $F_i$ takes negative values that are large in absolute terms, which means that, according to Eq. (27), the selfish station receives much less channel time than the other stations and hence a throughput far from $r_k^*$. For $p_k > p_k^*$, we have $p_i > p_i^*$. These $p$ lead to $F_i$ values that are close to the upper bound and, as the upper bound corresponds to $t_k = t^*$, this gives a throughput close to $r_k^*$ for the selfish station. In Section VI-E we show that this design leads to a robust behavior against selfish stations and transient conditions.

Fig. 8 analyzes the impact of fixed selfish configurations for a range of different $N$ and $\rho_2$ values. It shows the largest throughput that a selfish station can receive with a fixed configuration, which is obtained by performing an exhaustive search over the $\{p_k, \bar{R}_k\}$ space. This throughput is compared against the one that the station would receive if it played DOC. Again we observe that the station never benefits from playing selfishly, which validates the design of the DOC algorithm.

C. Selfish station with variable configuration 

According to Theorem 5, a selfish station cannot benefit from changing its configuration over time. For verification, we evaluate the throughput obtained by a selfish station with different adaptive strategies. These strategies are inspired by the schemes used in [20], [22] for a similar purpose. The underlying principle of all of them is that the cheating station uses a selfish configuration to gain throughput and, when it realizes that it is not gaining throughput, it assumes that it has been detected as selfish and switches back to the honest configuration to avoid being punished.

Fig. 10. Throughput obtained by multiple selfish stations.

In particular, we consider the following strategies. The 'adaptive $p_k$ strategy' fixes the $\bar{R}_k$ configuration of the selfish station to its optimal value, $\bar{R}_k = \bar{R}_k^*$, and modifies the $p_k$ configuration as follows: the station uses a selfish configuration of $p_k = 1$ as long as it obtains some gain, i.e. $r_k > r_k^*$. When $r_k$ drops below $r_k^*$, the station switches to the honest configuration, $p_k = p_k^*$, and stays with this configuration as long as $r_k$ stays below $0.95 r_k^*$. It switches back to $p_k = 1$ when $r_k$ grows above $0.95 r_k^*$. The 'adaptive $\bar{R}_k$ strategy' fixes the $p_k$ configuration to the optimal value, $p_k = p_k^*$, and modifies the $\bar{R}_k$ configuration following a strategy similar to the one above: the station uses a selfish configuration of $\bar{R}_k = 0$ (i.e., it uses all transmission opportunities) as long as it obtains some gain and switches to the honest configuration when it stops benefiting. Finally, the 'adaptive $p_k$ and $\bar{R}_k$ strategy' follows a similar behavior to the previous ones but adapts both the $p_k$ and the $\bar{R}_k$ configuration.
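The switching rule of the 'adaptive $p_k$ strategy' can be written as a two-state decision step. The sketch below is only an illustration of the rule as described above: the function name, its arguments and the throughput trace are hypothetical, and the case in which the measured throughput exactly equals a threshold is resolved in favor of the honest configuration as an assumption.

```python
from typing import Tuple


def adaptive_pk(selfish: bool, r_k: float, r_star: float,
                p_star: float) -> Tuple[bool, float]:
    """One decision step of the adaptive access-probability strategy.

    selfish : whether p_k = 1 was used in the interval that just finished
    r_k     : throughput measured in that interval
    r_star  : throughput the station would obtain by playing DOC
    p_star  : honest access probability
    Returns the new mode and the p_k to use in the next interval.
    """
    if selfish:
        # Keep the selfish configuration only while it yields a gain.
        selfish = r_k > r_star
    else:
        # Stay honest while the throughput remains below 0.95 r_star.
        selfish = r_k > 0.95 * r_star
    return selfish, (1.0 if selfish else p_star)


mode, history = True, []
for measured in (1.2, 0.9, 0.8, 0.97, 1.1):   # hypothetical throughput trace
    mode, p_k = adaptive_pk(mode, measured, r_star=1.0, p_star=0.1)
    history.append(p_k)
print(history)
```

The other two adaptive strategies would follow the same pattern, replacing or complementing the access probability with the rate threshold.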

Fig. 9 compares the throughput obtained with each of the above strategies against the one with DOC for different values of $N$. As expected, when all other stations play DOC, a given station maximizes its payoff by playing DOC as well, as it obtains a larger throughput than with any of the other strategies, confirming the result of Theorem 5.

D. Multiple selfish stations 

Corollary 3 states that multiple selfish stations cannot si- 
multaneously benefit by deviating from DOC, as it is only 
possible that one or more of the selfish stations experience 
some throughput gains if there are some other selfish stations 
that suffer some loss. 

To validate the result, we consider a network with $N = 10$ stations including two selfish stations, half of them (including one of the selfish stations) with $\rho_1 = 1$ and the other half (including the other selfish station) with $\rho_2 = 4$. We perform an exhaustive search over a wide range of $\{p_i, \bar{R}_i\}$ configurations of the two selfish stations. The results of this experiment are depicted in Fig. 10, which shows the throughput obtained by the two selfish stations ($r_k$ and $r_l$) for each of the configurations used in the exhaustive search. The figure also shows the throughput of the two stations when they both play DOC. There is no configuration that simultaneously improves the throughput of the two selfish stations, which confirms the result of Corollary 3.

We also observe from the figure that the region of feasible allocations has a triangular shape. This is a consequence of Theorem 7: since the maximum aggregated channel time that the two stations can obtain is fixed, any throughput increase in one station leads to a decrease in the other station of the same amount scaled by a constant factor that depends on the respective radio conditions.

E. Robustness to selfish behavior and transient conditions 

For a setting similar to that of Fig. 7 with 10 stations, half of them with $\rho_1 = 1$ and half with $\rho_2 = 4$, and one selfish station with a fixed configuration and $\rho_2 = 4$, we investigate the overall throughput of the wireless system. Again, the throughput obtained when all stations play DOC is given by the horizontal line. From Fig. 11 we see that the overall throughput is close to optimal for low values of the access probability $p_k$ of the selfish station and only gradually decreases for high values of $p_k$. For low values of $p_k$, well-behaved stations contend with a higher access probability than the selfish station, which yields an almost optimal throughput. For high values of $p_k$, the selfish station has to be punished, which unavoidably results in some throughput loss. However, the level of punishment is minimized to avoid driving the collision probability to unnecessarily high levels that harm
the overall throughput. Hence, even for very high $p_k$ and the subsequent high rate of contention collisions, some throughput remains for the well-behaved stations (note from Fig. 7 that the maximum throughput of the selfish station is less than approximately 1.8 Mbps). We conclude that, as intended, the design of $F_i$ maintains a level of throughput as high as possible for the well-behaved stations.

In addition to robustness against selfish behavior (as seen 
above), our design of Fi also aims at providing robustness in 
transient conditions. We investigate this through the following 
experiment: in a wireless network with 10 stations, a new 
station joins every 100 intervals, with its pi initially set to 
0.5, stays for 50 intervals and then leaves the system. With 
our design of Fi the total throughput obtained in this scenario 
is equal to 9.67 Mbps, while with a design of Fi 10 times 
smaller, it is only 6.52 Mbps, confirming the robustness of 
DOC to transient network conditions. 
Fig. 11. Total system throughput in the presence of a selfish station.

Fig. 12. Stability analysis of the parameters of the PI controller.

Fig. 13. Speed of reaction provided by the parameters of the PI controller.

Fig. 14. Performance with Jakes' channel model.



F. Parameter setting of the PI controller 

The main objective in the setting of the $K_p$ and $K_i$ parameters proposed in Section IV is to achieve a good tradeoff between stability and speed of reaction.

To validate that our system guarantees a stable behavior, we analyze the evolution over time of the throughput received by a station for the chosen $\{K_p, K_i\}$ setting and a configuration of these parameters 10 times larger, in a wireless network with $N = 10$ stations. We observe from Fig. 12 that with the proposed setting (labeled "$K_p, K_i$"), the throughput shows only minor deviations around its average value, while for a larger setting (labeled "$K_p * 10, K_i * 10$"), it shows unstable behavior with drastic oscillations.

To investigate the speed with which the system reacts against selfish stations, we use a wireless network with $N = 10$ stations where initially all stations play DOC and, after 50 intervals, one station turns selfish and changes its access probability to $p_k = 1$. Fig. 13 shows the evolution of the throughput of the selfish station over time. We observe from the figure that with our setting (labeled "$K_p, K_i$"), the system reacts quickly, and after a few tens of intervals the selfish station no longer benefits from its behavior. In contrast, for a setting of these parameters 10 times smaller (labeled "$K_p/10, K_i/10$"), the reaction is very slow and it takes almost 2000 intervals until the station stops benefiting from its misbehavior.

The results show that with a larger setting of $\{K_p, K_i\}$ the system suffers from instability while with a smaller one it reacts too slowly. Hence, the proposed setting provides a good tradeoff between stability and speed of reaction.

G. Impact of channel coherence time 

Our channel model is based on the assumption that different observations of the channel conditions are independent. In order to understand the impact of this assumption, we repeat the experiment of Fig. 6 using Jakes' channel model [23] to obtain the different channel observations. The results, for a Doppler frequency of $f_D = 2\pi/100\tau$, are given in Fig. 14. We observe that the throughput obtained is slightly smaller than that of Fig. 6. This is due to the fact that when the channel is bad, a station does not transmit after a successful contention and therefore it takes (on average) a shorter time until the next successful contention of this station. As a result, a station accesses the channel more often when it is bad than when it is good, which introduces a bias that slightly reduces the throughput. Overall, the results are sufficiently similar to those of Fig. 6 to conclude that our assumption on the channel model only has a minor impact on the resulting performance.

We further investigate whether, in the above scenario, a station with $\rho_2 = 4$ could obtain more throughput by using a selfish configuration. While the station obtains 1.752 Mbps with DOC, it can obtain up to 1.757 Mbps with a selfish configuration. Note that this increase is not due to the DOC design, as no configuration gives more channel time to the selfish station, but rather due to the fact that the transmission rate threshold of [3] is not truly optimal under Jakes' channel model. In any case, the throughput gain of the selfish station is negligible.
Fig. 15. Throughput comparison for a discrete set of rates.



H. Discrete set of transmission rates 

While all previous experiments assumed continuous rates, our analysis as well as the design of the DOC algorithm does not rely on any assumption on the mapping of SNR to transmission rates and therefore works for any (continuous or discrete) mapping function. To show that DOC is effective when only a set of discrete rates is allowed, we analyze a wireless system in which the only transmission rates available are {1, 2, 5.5, 12, 24, 48, 54} Mbps. For a given SNR, we choose the largest available transmission rate that is smaller than the one given by Eq. (87).
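The discrete mapping can be sketched as follows. The rate set is the one listed above and the continuous rate is the Shannon capacity $W \log_2(1 + \rho |h|^2)$; the function name and the fallback to a zero rate when even the lowest rate is not supported are assumptions of this sketch, since the text does not specify that case.

```python
import math

AVAILABLE_RATES_MBPS = (1, 2, 5.5, 12, 24, 48, 54)


def discrete_rate(bandwidth_hz, snr, gain_sq):
    """Largest available rate (in Mbps) smaller than the Shannon rate.

    Returns 0.0 when no available rate is below the Shannon rate
    (assumed fallback, not specified in the text).
    """
    shannon_mbps = bandwidth_hz * math.log2(1.0 + snr * gain_sq) / 1e6
    feasible = [r for r in AVAILABLE_RATES_MBPS if r < shannon_mbps]
    return float(max(feasible)) if feasible else 0.0


# Illustrative bandwidth of 10 MHz, normalized SNR of 4 and |h|^2 = 1.
print(discrete_rate(1e7, 4.0, 1.0))
```

Since the selected rate never exceeds the continuous one, this mapping can only reduce the rate achieved in each transmission.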

We repeat the experiment of Fig. 9 with discrete rates, and compare the throughput of a selfish station against the throughput that this station obtains when it plays DOC. The results in Fig. 15 confirm that a station cannot benefit from playing selfish. We further observe that, as expected, throughputs are smaller than those of Fig. 9 since, with the discrete mapping of SNR to rates, smaller transmission rates are achieved on average.

VII. Conclusions 

Recently proposed Distributed Opportunistic Scheduling (DOS) techniques provide throughput gains in wireless networks that do not have a centralized scheduler. One of the problems of these techniques is, however, that they are vulnerable to malicious users that may configure their parameters to obtain a greater share of the wireless resources at the expense of other, well-behaved, users. In this paper we address the problem by proposing a novel algorithm that prevents such throughput gains from selfish behavior.

With our approach, upon detecting a selfish user, stations react by using a more aggressive parameter configuration which serves to punish the selfish station. Such an adaptive algorithm has to carefully adjust the reaction against a selfish station so that the system does not become unstable by overreacting. A key aspect of the paper is that we use tools from the field of multivariable control theory combined with game theory in the design of our algorithm.

We conducted a control theoretic analysis of the DOC algorithm that shows that, when all the stations in the wireless network run DOC, the system behaves stably and converges to the desired configuration. We then used this control theoretic analysis to find a setting that provides a good tradeoff between stability and speed of reaction. In addition, we performed a game theoretic analysis of DOC based on repeated games to evaluate its behavior when there are one or more selfish stations in the wireless network. The analysis shows that neither a single selfish station nor several cooperating selfish stations can benefit from playing a strategy different from DOC, and that this holds for fixed as well as for adaptive strategies. Furthermore, the DOC strategy represents a subgame perfect Nash equilibrium.